Abstract

Mass measurements of objects that decay into hadronic jets, such as the top 
quark, are shown to be improved by using a variant of the kt jet algorithm in 
place of standard cone algorithms. The possibility and importance of better 
estimating the neutrino component in tagged b jets is demonstrated. These 
techniques will also be useful in the search for Higgs boson $\to b\bar{b}$.

I. INTRODUCTION



It is often necessary to measure the mass of an object that decays into hadronic jets. An
example of current importance is the decay of the top quark $t \to bW$, where the b quark
materializes as a jet and the W boson decays either leptonically or into two light-quark jets.
The accuracy with which the jets can be measured governs the error in the top quark mass
measurement, which is crucial to the study of electroweak physics: e.g., knowing $m_t$ allows
a logarithmic estimate of the Higgs boson mass in the minimal model. Accurate measurement
of the jet decays of the W is also valuable here because good W mass resolution can reduce
the combinatoric and other backgrounds in the analysis. Furthermore, the hadronically
decaying W can provide an alternative measure of $m_t$ based on the jet angles in the top
rest frame [1]: these angles determine $m_t/m_W$ in each event with errors that are largely
independent of the errors of the traditional measure, so the two methods can be averaged
to improve resolution. At the same time, $t\bar{t}$ events offer a sample of hadronic W decays
that can be compared against the known W mass to test the theoretical and experimental
assumptions underlying all jet spectroscopy. This opportunity is unique because hadronic
W decays are otherwise obscured by large QCD backgrounds and triggering problems [2,3].

A second important application of jet spectroscopy occurs in the search for Higgs boson
$\to b\bar{b}$. A moderate improvement in dijet mass resolution has been shown to extend the range
of possible discovery to $m_{\rm Higgs} = 80$-$100$ GeV/$c^2$ in Tevatron Run II [4].

The important sources of error in jet spectroscopy are (1) QCD radiation and hadronization
effects, (2) jet definitions, and (3) detector effects. We will compare these sources of
error quantitatively, using Monte Carlo simulation events for which the true partonic
momenta are known, and we will study the degree to which the jet finding algorithm can be
improved. There is an interplay between the first two sources of error because acceptable jet
algorithms differ from one another at next-to-leading order in $\alpha_s$ and in the
nonperturbative hadronization corrections they require. Previous top quark analyses [5,6] have
used cone algorithms for jet definition [7,8]. But I will show in this paper that a particular
version of the $k_\perp$ successive recombination algorithm [9] instead promises superior results.

The detector effects studied here are generic ones that arise from the basic segmented 
calorimeter design of all contemporary detectors. Particular attention is paid to the unseen 
neutrino component of b jets, which is found to be significant and partially correctable. 
Dealing with the additional foibles of each specific apparatus must be left to the experimen- 
talists. 



II. SIMULATION 

Throughout this paper we investigate the experimentally favorable single-lepton ($\ell = e$
or $\mu$) top quark channel $p\bar{p} \to t\bar{t}X$ with $t \to W^+ b \to jjj$ and
$\bar{t} \to W^- \bar{b} \to \ell^- \bar{\nu}_\ell\, j$ or their
charge conjugates at the present Tevatron energy $\sqrt{s} = 1.8$ TeV. The results also apply
rather directly to the six-jet channel where both $t$ and $\bar{t}$ decay hadronically.

Because of color confinement, the quarks from top decay show themselves as jets of 
hadrons [10]. One must infer the momenta of the quarks from measurements of the observed
jets. Because of the collinear and soft singularities of QCD, a quark naturally shares its 
momentum with accompanying gluons and/or $q\bar{q}$ pairs. It is necessary to include these as
much as possible in order to capture the momentum of the original quark. Sometimes the 
QCD radiation is so hard as to produce an extra separate isolated jet. In such events, 
reconstructing the mass of the original state is generally hopeless because the number of 
combinatoric possibilities resulting from the many possible sources of extra radiation is so 
large. In many events, however, the effect of the QCD radiation is simply to broaden the 
jets in the $(\eta, \phi)$ plane.

Some of this territory has been explored previously [11]. However, we use here a significantly
improved simulation program with an up-to-date estimate of the top quark mass, and
make a fuller study of the effect of different options and parameters in the jet definitions.
Also, we include the step of making "jet energy corrections" which has become standard
experimental practice.

A. Event generation and cuts



Events were simulated using the HERWIG 5.8 [12] Monte Carlo event generator, which
models both hard and soft QCD effects. HERWIG is known to agree well with jet data from
$e^+e^-$ interactions at values of $Q^2$ comparable to those that arise in top quark decay [13].
It also agrees well with next-to-leading order perturbative calculations of the distributions
in $p_\perp$, $\eta_t$, and $m_{t\bar{t}}$ for $t\bar{t}$ production [14]. The default HERWIG parameters were used, but
I have checked that substituting parameters that have been tuned to fit jet data from
$e^+e^-$ interactions [13] causes negligible change. HERWIG does not include decay correlations
between the $t$ and $\bar{t}$ [15], or the finite width of the top; but these effects are probably not
important for our purposes.

Using HERWIG for top production is not without risk in view of discrepancies with
perturbative calculations that appear specifically for top quark production [16]. I have incorporated
a "bug fix" recently circulated by the authors of HERWIG [17], which substantially increases
the amount of hard gluon radiation in top decays and removes the strong discrepancy shown
in Fig. 2 of Ref. [16].

I assume $m_t = 175$ in the simulation. To approximate standard experimental cuts, I
restrict the discussion to events in which the lepton from W decay has transverse momentum
$p_\perp > 20$ and pseudorapidity $|\eta| < 2$, and its neutrino has $p_\perp > 20$. These cuts keep 73% of
the single-lepton $t\bar{t}$ events. (Units with GeV = $c$ = 1 are used throughout this paper.)

Fig. 1 shows the $p_\perp$ distribution for the two b quarks and the two quarks from W decay.
Typical values are comparable to those for which HERWIG has been tested and tuned using
data from LEP [13]. I impose a cut requiring all four of these partons to have $p_\perp > 20$. This
cut keeps 67% of the events that pass the lepton cuts. It is intended to simulate the effect
of a cut on the minimum $p_\perp$ of the four highest-$p_\perp$ jets observed in each event. The cut is
made at the parton level in this simulation so that the different jet algorithms are compared
fairly, by applying them to the same set of events. The partonic cut should be very similar
to experimentally possible cuts on observed jet $p_\perp$, at least for the events that contribute
to the signal, for which the four highest-$p_\perp$ jets in fact correspond to the primary partons.

The reduction in signal due to a fairly strong cut on the minimum $p_\perp$ of the observed four
primary jets is a price worth paying, particularly as the total number of observed events
rises, for several reasons: (1) It avoids the need to measure jets of low $p_\perp$, which have
intrinsically large fractional uncertainties as is quantified below; (2) It increases the fraction
of events for which the observed jets will be correctly matched to their original partons,
especially since only the four jets with highest $p_\perp$ observed in each event will be analyzed to
reduce the combinatoric background in assigning the jets; and (3) $p_\perp$ cuts have been shown
effective in suppressing the major background from W + jets processes without $t\bar{t}$ [18].



B. Detector models 

The detector is modelled as an array of $0.1 \times 0.1$ cells in pseudorapidity $\eta = -\ln\tan(\theta/2)$
and azimuthal angle $\phi$. This granularity in the $(\eta, \phi)$ plane is similar to that of the current
D0 detector, while CDF detector cells have width 0.26 in $\phi$. The detector is assumed
to have no ability to identify particles, so the energy deposited in each cell according to 
the simulation is analyzed as if it came from a massless particle whose momentum direction 
pointed toward the center of the cell. (In real life, corrections must be made for the spreading 
of energy into neighboring cells due to the finite size of the shower generated by a single 
particle. This spreading also creates a possibility in principle to locate the direction of 
momentum more accurately than the cell size would predict.) 

We consider three different models for the energy resolution of the detector cells. In 
model A (Ideal), the total energy deposited in each cell is measured exactly, even including 
the contribution from neutrinos. In models B and C, the total energy in each cell is smeared 
by realistic Gaussian errors of standard deviation $\Delta E$ given by

$$\frac{\Delta E}{E} = \sqrt{\frac{c_1^2}{E} + c_2^2} \qquad (1)$$

with $c_1 = 0.55$, $c_2 = 0.03$ for charged hadrons (mostly $\pi^\pm$) and $c_1 = 0.15$, $c_2 = 0.003$ for $\gamma$,
$e$ or $\mu$ (mostly $\gamma$ from $\pi^0$). These parameters are approximately those of the D0 detector.

Models B and C differ only in that neutrinos are treated like electrons in B, while in 
C the detector is blind to neutrinos like a real detector. The purpose for this distinction 
is that we will find a sizeable difference between these two models because of the frequent 
presence of neutrinos in b jets, and it may be possible to compensate for some of the neutrino 
component on an event-by-event basis using leptonic information that is acquired as a part 
of some b tags. 
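
The three detector models can be illustrated with a short sketch. It assumes that the stochastic and constant terms of the resolution combine in quadrature, $\Delta E/E = \sqrt{c_1^2/E + c_2^2}$, and that each particle's deposit in a cell is smeared separately; the names `sigma_e`, `smear_cell` and the particle-class labels are hypothetical and are not taken from the simulation program used for this paper.

```python
import math
import random

# Resolution constants quoted in the text: (c1, c2) per particle class.
RESOLUTION = {
    "hadron": (0.55, 0.03),   # charged hadrons, mostly pions
    "em": (0.15, 0.003),      # photons, electrons, muons
}

def sigma_e(energy, kind):
    """Standard deviation of the measured energy.

    Assumption: the two terms add in quadrature,
    dE/E = sqrt(c1^2 / E + c2^2).
    """
    c1, c2 = RESOLUTION[kind]
    return energy * math.sqrt(c1 * c1 / energy + c2 * c2)

def smear_cell(deposits, model, rng=random):
    """Return the measured energy of one calorimeter cell.

    deposits: list of (energy, kind), kind in {"hadron", "em", "neutrino"}.
    model:    "A" ideal, "B" smeared with neutrinos seen,
              "C" smeared with neutrinos invisible.
    """
    total = 0.0
    for energy, kind in deposits:
        if model == "A":
            total += energy            # exact, neutrinos included
            continue
        if kind == "neutrino":
            if model == "C":
                continue               # detector is blind to neutrinos
            kind = "em"                # model B: treat like an electron
        total += rng.gauss(energy, sigma_e(energy, kind))
    return total
```

Comparing `smear_cell(deposits, "B")` with `smear_cell(deposits, "C")` for the same deposits isolates the effect of the unseen neutrino energy from that of the calorimeter resolution.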

Cells that receive $p_\perp < 0.75$ are ignored in the analysis. This mimics a limitation of the
D0 detector due to noise levels from its uranium calorimeter. But it may be a good idea
anyway to drop contributions from very low $p_\perp$ particles, which are at best poorly associated
with any jet direction in part because of hadron resonance decay effects and the difference
between rapidity and pseudorapidity; and because extraneous low $p_\perp$ particles are present
from soft hadronic interactions that are additional to the hard scattering that produced $t\bar{t}$
("background event") and from independent $p\bar{p}$ interactions at high luminosity ("pileup").
The dependence on this $p_\perp$ threshold will be discussed in Sect. II E.

Additional limitations that depend on experimental details of real detectors, such as 
differences in the response to charged and neutral particles in a shower, nonlinearity of 
that response, small regions where there is no response, etc., are not included here. The 
mass resolutions we find therefore represent an optimistic limit for what can be expected. 
However, the neglected effects are generally small compared to those included, and they 
should in particular not affect our conclusions on the relative merits of different methods of 
analysis. 

C. Jet definition 

For jet spectroscopy, I advocate a particular version of the $k_\perp$ jet finding algorithm [9]
that is defined by the following explicit steps.

1. Begin with a list of "jets" that consists simply of the four-momentum from each cell
above the $p_\perp > 0.75$ threshold, treated as a zero-mass particle. (There are typically
~40-60 such cells, but more in a real detector where the energy of a single particle
is spread over several cells.)

2. Compute $d_i$ for each jet and $d_{ij}$ for each pair of jets, where $d_i$ is the jet transverse
momentum and

$$d_{ij} = \min(d_i, d_j)\, \Delta R / R_0 \qquad (2)$$

where

$$\Delta R = \sqrt{(\eta_i - \eta_j)^2 + (\phi_i - \phi_j)^2} \qquad (3)$$

is the angular separation in the $(\eta, \phi)$ "Lego" plane. The parameter $R_0$ was introduced
in Ref. [9] to generalize the $k_\perp$ algorithm. It sets the scale for the size of the jets in
the $(\eta, \phi)$ plane. Although it does not create a sharp cutoff, cells that are farther than
$R_0$ from their final jet axis seldom contribute. In this analysis, I mainly use $R_0 = 1$,
which corresponds to the original algorithm. The dependence on $R_0$ will be discussed
in Sect. II E.

3. Find the minimum of all $\{d_i, d_{ij}\}$. If the minimum value is greater than $P_\perp^0$, the procedure
is finished and the current list contains the final jet momenta. This termination rule
is different from some other versions of the $k_\perp$ algorithm. The parameter $P_\perp^0$ defines a
hardness scale at which the algorithm terminates. In particular, the final jet list will
contain no jets with $p_\perp$ below $P_\perp^0$. I find that $P_\perp^0 = 10$ GeV/$c$ works well for the top
quark analysis.

4. Otherwise, if the minimum is a $d_i$, that jet is deemed to be a fragment of one of the
original beam particles (initial state radiation) and it is dropped from the list.

5. Otherwise, the minimum is a $d_{ij}$. That pair of jets is combined into a single jet by
adding their four-momenta.
(The simple choice of adding the four-momenta to combine protojets has an obvious
good feature that the invariant mass of a multi-jet object will be stable with respect
to changing the assignment of a cell or group of cells from one jet to another within
the object. A customary alternative to this choice is to combine protojets according
to the "Snowmass Accord" [7] formulae

$$p_\perp = p_\perp^i + p_\perp^j \qquad (4)$$
$$\eta = (\eta_i\, p_\perp^i + \eta_j\, p_\perp^j) / (p_\perp^i + p_\perp^j) \qquad (5)$$
$$\phi = (\phi_i\, p_\perp^i + \phi_j\, p_\perp^j) / (p_\perp^i + p_\perp^j) \qquad (6)$$

where $\phi_j$ must be shifted by $\pm 2\pi$ here and in Eq. (3) if possible to minimize $|\phi_i - \phi_j|$.
I find this rule to give slightly poorer mass resolution than simply adding the
four-momenta.)

6. Go to step 2. 
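
The six steps can be sketched in a few lines of Python. This is a minimal illustration and not the program used for the results in this paper: the function names and the `(pt, eta, phi)` input format are hypothetical, pseudorapidity is used for the merged jets, and the loop is taken to stop once the smallest of all $d_i$ and $d_{ij}$ exceeds the hardness scale, which is the reading under which no jet softer than that scale survives.

```python
import math

def four_vector(pt, eta, phi):
    """Massless four-momentum (E, px, py, pz) for a calorimeter cell."""
    return (pt * math.cosh(eta), pt * math.cos(phi),
            pt * math.sin(phi), pt * math.sinh(eta))

def pt_eta_phi(p):
    """Transverse momentum, pseudorapidity and azimuth of a four-momentum."""
    _, px, py, pz = p
    pt = math.hypot(px, py)
    return pt, math.asinh(pz / pt), math.atan2(py, px)

def delta_r(a, b):
    """Separation in the (eta, phi) plane, phi difference wrapped to [-pi, pi]."""
    _, eta_a, phi_a = pt_eta_phi(a)
    _, eta_b, phi_b = pt_eta_phi(b)
    dphi = (phi_a - phi_b + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta_a - eta_b, dphi)

def kt_jets(cells, r0=1.0, pt_stop=10.0):
    """Cluster (pt, eta, phi) cells; return jet four-momenta, hardest first."""
    jets = [four_vector(*c) for c in cells]          # step 1
    while jets:
        best, what = None, None
        for i, a in enumerate(jets):                 # step 2
            d_i = pt_eta_phi(a)[0]
            if best is None or d_i < best:
                best, what = d_i, (i, None)
            for j in range(i + 1, len(jets)):
                d_j = pt_eta_phi(jets[j])[0]
                d_ij = min(d_i, d_j) * delta_r(a, jets[j]) / r0
                if d_ij < best:
                    best, what = d_ij, (i, j)
        if best > pt_stop:                           # step 3: finished
            break
        i, j = what
        if j is None:
            jets.pop(i)                              # step 4: beam fragment
        else:                                        # step 5: add four-momenta
            merged = tuple(x + y for x, y in zip(jets[i], jets[j]))
            jets.pop(j)
            jets[i] = merged
    return sorted(jets, key=lambda p: pt_eta_phi(p)[0], reverse=True)
```

The search over all pairs is repeated from scratch on every pass, which is adequate for a few dozen cells per event; a production implementation would cache the distances.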

Only the four highest-$p_\perp$ jets found by the $k_\perp$ algorithm are used in the analysis. This
causes a very small fraction (~2%) of events to be dropped immediately because fewer than
4 jets are found. This can happen even though we are looking for jets down to $p_\perp = 10$
from partons with $p_\perp > 20$, because one jet can split into two or more by hard radiation, or
because two jets can lie so close together in $(\eta, \phi)$ that they appear as one. (It will eventually
be desirable to keep more than the four highest-$p_\perp$ jets, to allow for initial state radiation at
higher $p_\perp$ than one of the four primary decay partons or hard radiation from the $t$, $\bar{t}$, $b$, or
$\bar{b}$ [20], in order to test our understanding of QCD radiation; but because of its combinatoric
richness, this will not be helpful for the mass measurement.)

The four hardest jets are matched to the four original parton momenta, which are of
course known in the simulation, by trying all 4! = 24 assignments and keeping the one with
the smallest root mean square error in fitting the 4 parton directions in the $(\eta, \phi)$ plane.
The jet energies are not considered in this matching process, so as not to bias our study of
the accuracy of jet energy measurement.

The distribution in the rms error of the best fitting assignment shows a strong peak at
small values, above a background that extends to large ones. We impose a cut < 0.8 on the
total rms error, which is equivalent to a cutoff at $\approx 0.4$ for the average deviation in $(\eta, \phi)$
from each of the four parton directions. This cut keeps 67% of the events. The events it
removes are mainly those in which the four highest-$p_\perp$ jets are not the right ones because
of initial state radiation of a gluon with higher $p_\perp$ than one of the top decay quarks. Thus
our procedure of keeping only the four jets with highest $p_\perp$ captures the desired two b jets
and two W decay jets about 2/3 of the time.
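
The matching over all 24 assignments and the cut on the fit quality can be sketched as follows. The function `match_jets` and its argument format are hypothetical, and the "total rms error" is assumed here to mean the square root of the summed squared separations, which makes a total of 0.8 correspond to 0.4 per jet for four jets.

```python
import itertools
import math

def delta_r(a, b):
    """Separation of two (eta, phi) directions, phi wrapped to [-pi, pi]."""
    dphi = (a[1] - b[1] + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(a[0] - b[0], dphi)

def match_jets(jets, partons, max_total_rms=0.8):
    """Best one-to-one jet/parton assignment using directions only.

    jets, partons: equal-length lists of (eta, phi); energies are ignored.
    Returns (assignment, total_rms), where assignment[k] is the index of
    the jet matched to parton k, or (None, total_rms) if the cut fails.
    """
    best_perm, best_err = None, float("inf")
    for perm in itertools.permutations(range(len(jets))):
        err = math.sqrt(sum(delta_r(jets[j], partons[k]) ** 2
                            for k, j in enumerate(perm)))
        if err < best_err:
            best_perm, best_err = perm, err
    if best_err >= max_total_rms:
        return None, best_err
    return list(best_perm), best_err
```

Because only directions enter, the jet energies of the matched pairs remain an unbiased sample for the resolution study.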

The events that survive the rms fit cut are used to study the p± resolution for jets, and 
the resulting mass resolution for $t \to bW \to jjj$, in the next two sections. To compare the
effects of different jet algorithm parameters or detector parameters fairly, the location of the 
cut is adjusted slightly to keep the fraction of events that pass the cuts constant. 

D. Jet energy resolution 

Figs. 2-4 show the ratio $p_\perp^{\rm Jet}/p_\perp^{\rm Parton}$ at a fixed value of $p_\perp^{\rm Parton}$. The solid curves are for jets from W
decay (light quarks), while the dotted curves are for b jets. The three Figures correspond to
the three models for calorimeter energy resolution: Fig. 2 assumes perfect resolution, while
Fig. 3 and Fig. 4 both include the realistic energy resolution given in Eq. (1). The detector
is assumed capable of detecting neutrinos in Figs. 2 and 3, while it is blind to them in Fig. 4.

All of the curves peak at $p_\perp^{\rm Jet}/p_\perp^{\rm Parton}$ below 1 because of the assumed $p_\perp$ threshold of
the cells and because QCD radiation can cause a significant fraction of the jet energy to
appear at large angles where it is omitted by the jet algorithm. The peaks in Fig. 3 are
more than twice as wide as the peaks in Fig. 2. This indicates that the energy resolution of
the calorimeter cells is the major source of error in the jet energy measurement: e.g., if the
QCD and calorimeter cell size errors included in Fig. 2 and the resolution errors were equal,
the peak width would increase only by a factor $\sqrt{2}$ in going from Fig. 2 to Fig. 3.
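
The argument can be made explicit under the assumption that the two sources of error are independent, so that their widths add in quadrature. Here $\sigma_2$ and $\sigma_3$ denote the peak widths in Figs. 2 and 3 and $\sigma_{\rm res}$ the contribution of the cell energy resolution; these symbols are introduced only for this illustration.

```latex
\sigma_3^2 = \sigma_2^2 + \sigma_{\rm res}^2 , \qquad
\sigma_{\rm res} = \sigma_2 \;\Rightarrow\; \sigma_3 = \sqrt{2}\,\sigma_2 , \qquad
\sigma_3 > 2\,\sigma_2 \;\Rightarrow\; \sigma_{\rm res} > \sqrt{3}\,\sigma_2 .
```

An observed broadening by more than a factor of two therefore implies that the resolution term exceeds the QCD and cell-size term by at least $\sqrt{3}$.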

Fig. 2 shows only a small difference between b jets (dotted) and the light quark jets
from W decay (solid). The difference remains small when energy resolution is included
in Fig. 3. In going from Fig. 3 to Fig. 4, there is almost no change in the W decay jets 
(solid), as expected because there is not much neutrino component in light quark jets. But 
a dramatic difference appears between Fig. 3 and Fig. 4 for the b jets (dotted). The loss in 
b-jet resolution due to varying amounts of missing neutrino energy is very significant. It will 
therefore be useful to investigate the possibility of correcting for the neutrinos on a jet-by-jet 
basis, using information that is acquired as a part of b-jet identification. 

To study the dependence on partonic $p_\perp$, we can characterize peaks like those shown
in Figs. 2-4 by the value of $p_\perp^{\rm Jet}/p_\perp^{\rm Parton}$ corresponding to the 50th percentile (median) of
the distribution, and the values corresponding to the 16th and 84th percentiles which define
the middle 68% of the probability distribution. These would be the $\pm 1\sigma$ points if the
distributions were Gaussian. The result is shown in Figs. 5-7, expressed in terms of the
difference $p_\perp^{\rm Jet} - p_\perp^{\rm Parton}$ instead of the ratio for convenience.
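
A minimal sketch of this percentile summary is given below. The helper names are hypothetical, and linear interpolation between order statistics is one of several common percentile conventions, chosen here only for simplicity.

```python
def percentile(values, q):
    """Linear-interpolation percentile of a non-empty list, q in [0, 100]."""
    data = sorted(values)
    pos = (len(data) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(data) - 1)
    return data[lo] + (data[hi] - data[lo]) * (pos - lo)

def summarize(differences):
    """Median and 16th/84th percentile band of p_jet - p_parton values."""
    return {
        "p16": percentile(differences, 16),
        "median": percentile(differences, 50),
        "p84": percentile(differences, 84),
    }
```

Applying `summarize` to the jets in each bin of partonic $p_\perp$ gives the three curves of the kind described above; unlike a mean and standard deviation, these quantities are insensitive to the long low-side tail.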

One sees that the 50th percentile curves in Figs. 5-7 can be reasonably well approximated
by straight lines. Those straight line fits can be used to make average "jet energy corrections"
of a linear form

$$p_\perp^{\rm Parton} = A + B\, p_\perp^{\rm Jet} \qquad (7)$$

to better estimate the partonic energy from an observed jet energy. The appropriate
parameters $A$ and $B$ are somewhat different for b jets and W-decay jets, and vary with the
parameters of the jet algorithm.
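
One way to obtain such parameters is an ordinary least-squares line through the binned medians. The sketch below assumes the fit is made directly to pairs of (median jet $p_\perp$, partonic $p_\perp$); the function names are hypothetical and the fitting procedure actually used for the figures may differ.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = a + b * x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    return mean_y - b * mean_x, b

def correct_jet_pt(pt_jet, a, b):
    """Apply the linear correction p_parton = A + B * p_jet."""
    return a + b * pt_jet
```

Separate `(a, b)` pairs would be fitted for b jets and for light-quark jets, and refitted whenever the jet algorithm parameters change.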

After average jet energy corrections have been made, fluctuations from jet to jet remain
due to different amounts of QCD radiation falling outside the identified jet. These
fluctuations contribute to the energy resolution errors, and hence to the width of peaks in multi-jet
mass distributions. The "$\pm 1\sigma$" spread in $p_\perp^{\rm Jet} - p_\perp^{\rm Parton}$ is seen in Figs. 5-7 to grow only
slowly with $p_\perp^{\rm Parton}$; so the fractional accuracy of the $p_\perp$ measurement improves significantly
with increasing $p_\perp$. The spread in $p_\perp^{\rm Jet} - p_\perp^{\rm Parton}$ is larger for b jets. This is dramatically so in
the case of the most realistic detector model C, which admits the possibility of large energy
escape in the form of neutrinos.



E. Top quark mass resolution

We concentrate on the mass measurement of the hadronically decaying top, since it is a
good example of "jet spectroscopy" in general, and since the treatment of the leptonically
decaying top is complicated by errors in the measurement of the neutrino momentum. (The
transverse momentum of the neutrino is inferred from missing $p_\perp$, which can be strongly
affected by detector imperfections and by the presence of neutrinos in the b or c jets. The
longitudinal momentum of the neutrino is subsequently obtained by assuming $m_{\ell\nu} = m_W$,
which acquires serious uncertainties from the error in $p_\perp^\nu$ and the finite W width in addition
to the two-fold ambiguity in the sign of $\eta_\nu - \eta_\ell$.)
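
The origin of the two-fold ambiguity can be seen by solving the mass constraint for the neutrino longitudinal momentum, which leads to a quadratic equation. The sketch below treats the lepton and neutrino as massless; the function name, the default W mass of 80.4, and the clipping of a negative discriminant are assumptions made for illustration and are not part of the analysis in this paper.

```python
import math

def neutrino_pz(lepton, nu_px, nu_py, m_w=80.4):
    """Solve (p_lepton + p_nu)^2 = m_w^2 for the neutrino p_z.

    lepton: (px, py, pz) of the charged lepton, taken as massless.
    Returns the two roots (equal when the discriminant is clipped to zero).
    """
    lx, ly, lz = lepton
    e_l = math.sqrt(lx * lx + ly * ly + lz * lz)
    pt_l2 = lx * lx + ly * ly
    pt_nu2 = nu_px * nu_px + nu_py * nu_py
    a = 0.5 * m_w * m_w + lx * nu_px + ly * nu_py
    disc = a * a - pt_l2 * pt_nu2
    # Mismeasured missing p_perp can make disc negative; clipping to zero
    # is one common convention, adopted here purely for illustration.
    root = e_l * math.sqrt(max(disc, 0.0))
    return ((a * lz - root) / pt_l2, (a * lz + root) / pt_l2)
```

Both roots reproduce the assumed W mass, so the choice between them must come from elsewhere in the event, and any error in the measured transverse components feeds directly into both.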

Three-jet mass distributions from $t \to bW \to jjj$ are shown in Fig. 8 for the three models
of calorimeter energy resolution. In generating these histograms, the best match to the four 
parton directions was again used to infer the jet assignments. But this time the best-fitting 
assignment is plotted for every event, without a cut on the quality of the fit. This makes 
the simulation more realistic, since it includes backgrounds of a type that will be present in 
actual data analysis. The jet assignments are needed to know which three of the four jets 
come from the hadronic top decay, and also because linear jet energy corrections are made 
using Eq. (7) with parameters $A$ and $B$ that are slightly different for b jets and light-quark
jets according to Figs. 5-7. 

Thanks to the jet energy corrections, the peaks are centered very close to the input value
$m_t = 175$. Their shapes are not symmetrical, but are instead skewed toward low masses
since QCD radiation and loss due to neutrinos can substantially reduce the observed energy
of a jet, but cannot increase it. The widths of these peaks can be measured by fitting
the histograms to a Gaussian plus a linear background over the fairly narrow mass range
$160 < M_{jjj} < 190$: this is useful for purposes of comparison, even though the resulting
fits are not statistically adequate at the high statistics at which the histograms have been
computed. The resulting Gaussian peaks correspond to standard deviations of $\Delta M = 4.0$,
7.3, and 9.1 for the three models of resolution. Fitting over a different mass range results in
somewhat different numbers, but leads to the same qualitative conclusions.

The mass resolution for $m_t$ can be improved by replacing the usual invariant mass
estimate, which is based on the sum of the 4-momenta of the three jets, by the average of
that value and a mass estimate based on the jet angles in the top rest frame [1]. Three-jet
mass distributions obtained using this average variable are shown in Fig. 9. These peaks
are more symmetrical than those of Fig. 8 because fluctuations in the jet angle part of the
mass measurement have no definite sign. The peaks are narrower in each case, with widths
$\Delta M = 3.9$, 5.7, and 7.3 for the three models of resolution. This demonstrates the value of
the jet angle method.

The dependence on the assumed calorimeter cell threshold is not large. For example,
raising the threshold from $p_\perp > 0.75$ to $p_\perp > 1.00$ increases the width of the mass peak by
only ~5% in the case of model B for the energy resolution. Similarly, lowering the threshold
to $p_\perp > 0.50$ narrows the peak by ~5%. The actual effect would be even less than that
because the "background event," which contributes random noise at low $p_\perp$, has not been
included in the simulation.

The dependence on the jet radius parameter $R_0$ of the $k_\perp$ algorithm is also not large.
The original choice $R_0 = 1$ is found to be close to optimal. Going to $R_0 = 0.8$ or $R_0 = 1.2$
results in mass peaks that are a few percent broader.

One might wonder if the $k_\perp$ algorithm could be improved in some cases by revising its
assignment of cells to jets according to their proximity to the jet axes it finds. To test this,
the following plausible modification was tried: After completing the work of the $k_\perp$ algorithm
on each event, any cell above the $p_\perp > 0.75$ threshold was reassigned to the nearest of the
four highest-$p_\perp$ jet axes if the cell was within 0.7 of that axis and (1) it was previously
assigned to a different jet whose axis is farther away than this new one by a factor > 1.2,
or (2) it was previously not assigned to any jet. This modification affected only 17% of
the events, almost entirely through option (1). It produced a small improvement in energy
resolution for that subset of events, but the improvement was not large enough to make it
worthwhile to "second-guess" the $k_\perp$ algorithm in this way.



III. COMPARISON WITH CONE ALGORITHMS 

The analysis of jet data at hadron colliders has traditionally been done using cone
algorithms, in which a jet is defined as the final particles within a circle of fixed radius $R$
in the $(\eta, \phi)$ plane. A typical cone size is $R = 0.7$; but smaller values like 0.4 have been
used for processes like $t\bar{t}$ production, to improve the sensitivity to configurations where
partons lie close together in the $(\eta, \phi)$ plane at the expense of increased errors in the partonic
momentum measurement due to fluctuations in the QCD radiation lying outside the cone.

Cone algorithms are not at all straightforward to design, nor even to describe, because of 
ambiguities in how to treat situations in which jets overlap. Overlap occurs to some degree 
whenever two jet axes lie within $2R$ of each other in $(\eta, \phi)$, which happens in the majority
of events of the type we are considering. 

I have repeated the analysis of Section II with the $k_\perp$ algorithm replaced by a cone
algorithm [21] that begins with clustering based on equivalence classes [22]. I have also repeated
the analysis using a version of the cone algorithm by Seymour [11], which is patterned after
current practice. A cone radius $R = 0.7$ was used in both cases. The results achieved by
these two cone algorithms, which are alike in intent but very different in implementation,
are strikingly similar to each other.

Cone algorithms generally do not allow the final jet momenta to lie within $R$ of each
other. This leads to a significant loss of events in the top analysis, where the nearest pair of
the four primary partons lie within 0.7 of each other in 20% of the events. It shows up quickly
on repeating the analysis of Sect. II, in that 27% of the events for the algorithm of Ref. [21],
or 32% for the algorithm of Ref. [11], are rejected because fewer than the required four jets
with $p_\perp > 10$ are found, as compared to only < 2% for the $k_\perp$ algorithm. Furthermore, the
distribution of errors in the best fit to the partonic angles is broader for the cone algorithms
than for $k_\perp$.

For events in which the necessary four jets are found, both cone algorithms perform
almost as well as the $k_\perp$ one. In particular, the final $M_{jjj}$ distributions are quite similar
to those shown in Figs. 8-9, especially for the cases in which realistic calorimeter energy
resolution is included, which masks the differences. The average energy corrections needed
for the cone algorithms are also similar to those for the $k_\perp$ algorithm, although slightly
larger.

One could therefore say that the $k_\perp$ algorithm provides only slightly better mass
resolution than the cone algorithms, but allows approximately 30% more events to be kept.
Another way to compare the algorithms would be to impose a cut on the minimum
separation between observed jets in $(\eta, \phi)$ for the $k_\perp$ algorithm, or to raise the $p_\perp$ threshold for
defining jets in it, or to make a combination of such cuts that would make the fraction of
events kept by the various algorithms the same. The benefits of the $k_\perp$ algorithm would
then appear entirely in the form of improved mass resolution.

The solid curve in Fig. 10 shows the fraction of events for which a good match is found
between the 4 highest-$p_\perp$ jets found by the $k_\perp$ algorithm and the 4 primary partons (using
a criterion based on the quality of fit to the $(\eta, \phi)$ direction and $p_\perp$ of all four) as a function
of the minimum separation between jets as observed by the algorithm. The algorithm is
seen to have significant success even at minimum separations below 0.5. Meanwhile, the
two versions of cone algorithm with $R = 0.7$ (dotted and dashed curves in Fig. 10) are
somewhat less effective overall, and are completely unable to see separations smaller than
the assumed cone size. A smaller cone size could be used to extend the effectiveness of the
cone algorithms to smaller minimum separation, as CDF and D0 have both done; but that
would reduce the accuracy of the $p_\perp$ measurements, and hence reduce the overall fraction
of good matches. (As an aside, the curves shown in Fig. 10 are seen to turn over at large
minimum separation. This may at first sight be puzzling, but it only reflects the fact that
large separation between all 6 pairs of partons is very unlikely, so if the jet finder sees such
a configuration, it is likely to be mistaken.)

In setting up the definitive top quark data analysis, the best choice of cuts on minimum
jet-jet angular separation and minimum jet $p_\perp$ will have to be determined using a full
simulation of both the detector and the complete analysis procedure. Optimal choices for
the cuts for the purpose of mass measurement will also depend on the number of events
available for analysis, since one can afford statistically to cut harder when there are more
events to begin with.

Another way to compare the k± and cone algorithms was carried out to study the ability 
to analyze objects that decay into two jets in the presence of additional jets, which will be 
necessary in the Higgs boson search. For this study, ti events were generated as before except 
for an additional cut requiring the partons from W decay to be separated from each other 
by > 1.0 in the (77, 0) plane. This cut is minor because these jets tend to be opposite each 
other in azimuthal angle and hence well separated. The ideal calorimeter model was used. 
The events were analyzed as before except that all jets found by the jet finder were kept 
and there was no requirement that four or more jets be found. The pair of jets (at least two 
jets were always found) making the best fit in (77, 0) to the two partons from W decay were 
identified. Linear jet energy corrections were applied as before to these jets. The invariant 
mass of the pair was computed and corrected for the deviation of the partonic W mass from 
its nominal value, to remove the effect of finite W width that is included in the simulation. 
The resulting distribution in dijet mass is shown in Fig. 11 for the k± algorithm and the two 
versions of cone algorithm. The distributions are normalized to the same number of events, 
so the superiority of the k± method is demonstrated by the fact that its peak is significantly 
higher. This is true even though the width of the peak — measured by full width at half 
maximum above background or "by eye" — is not obviously better. The point is that many 
events are so clean that all three jet finders give almost identical results for them. This 
can be seen in Fig. 12, which shows the distribution of the total root-mean-square deviation 
between the two jets identified as coming from the W and their true parton directions, i.e. 
the quantity that was minimized to identify the "correct" jet pair. Compared to the k⊥ 
algorithm, the two cone algorithms both have relatively strong tails into a region of large 
deviation where the W decay axes have not been located very well. These tails result mainly 
from events in which the jet finder includes contributions to a W decay jet from particles 
actually coming from a b jet that happens to lie nearby in the (η, φ) plane. This explains 
the tails extending toward higher dijet mass M_jj in Fig. 11. The k⊥ algorithm is less easily 
confused by such particles. 
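
The jet-pair identification described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: jets and partons are represented as hypothetical (η, φ) tuples, and the "total rms deviation" is taken to be the quadrature sum of the two jet-parton distances in (η, φ), which may differ in detail from the definition used for Fig. 12. The correction for the finite W width is not included.

```python
import itertools
import math

def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into the interval (-pi, pi]."""
    d = (phi1 - phi2) % (2.0 * math.pi)      # Python's % gives a value in [0, 2*pi)
    return d - 2.0 * math.pi if d > math.pi else d

def rms_deviation(jets, partons):
    """Quadrature sum of (eta, phi) distances, pairing jets and partons in order."""
    total = 0.0
    for (eta_j, phi_j), (eta_p, phi_p) in zip(jets, partons):
        total += (eta_j - eta_p) ** 2 + delta_phi(phi_j, phi_p) ** 2
    return math.sqrt(total)

def best_pair(jets, partons):
    """Return (deviation, (i, j)) for the ordered jet pair that best fits the two partons."""
    best = None
    for i, j in itertools.permutations(range(len(jets)), 2):
        dev = rms_deviation((jets[i], jets[j]), partons)
        if best is None or dev < best[0]:
            best = (dev, (i, j))
    return best

# Hypothetical event: three jets and the two partons from W decay, as (eta, phi).
jets = [(0.0, 0.1), (1.0, 3.0), (-0.5, -3.0)]
partons = [(1.05, 3.05), (0.0, 0.0)]
deviation, pair = best_pair(jets, partons)
```

Here `pair` gives the indices of the jets assigned to the first and second parton, and `deviation` is the minimized quantity whose distribution would be histogrammed to compare jet finders.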



IV. NEUTRINO MOMENTUM DISTRIBUTIONS 

Figs. 8-9 show that there is a substantial loss in mass resolution caused by fluctuations in 
the neutrino component of b jets. To study this in more detail, Fig. 13 shows the distribution 
of the observable (i.e., non-neutrino) fraction of jet momentum 

$$ z = 1 - \frac{p^{\rm Neutrinos}}{p^{\rm Parton}} \tag{8} $$

for b jets that contain at least one neutrino. The log-log plot reveals that the distribution 
can be rather well approximated by a power law: dP/dz ∝ z^A with A = 4.4 for z < 0.98. 
The dotted curve in Fig. 13 shows the distribution for the subset of jets that contain an e± 
or μ± with p⊥ > 2, which may be detected experimentally, especially in the case of μ±. 
The two distributions are nearly identical. Distributions with stronger or weaker cuts on 
the p⊥ of the e± or μ±, or with cuts on p⊥^Parton, are also about the same. 

We can use this power law over the entire range 0 < z < 1 because the neutrino contribution 
to p⊥ is small compared to other errors in jet energy measurement in the tiny region 
0.98 < z < 1 where the power law doesn't fit well. Including the contribution from jets 
without neutrinos then gives a normalized parametrization of the distribution in observable 
momentum fraction 

$$ \frac{dP}{dz} = f\,\delta(z-1) + (1-f)\,(1+A)\,z^{A} \tag{9} $$

where f is the fraction of jets with negligible or zero neutrino contribution. For all b jets, 
f = 0.59, which implies that 23% of them hide > 10% of their momentum in neutrinos and 
12% of them hide > 20%. For the 33% of b jets that contain an electron or muon with 
p⊥ > 2, f is only 0.10, which implies that 51% of them hide > 10% of their momentum 
in neutrinos and 27% of them hide > 20%. It is thus clearly advantageous to use different 
estimates to correct for the missing neutrino energy in a b jet, depending on whether or 
not a lepton is observed in the jet. This has already been done in the analysis of the top 
quark signal [^3j. A topic worthy of future study would be to see if any further details of 
the observed jet, in addition to the mere presence or absence of a lepton, can be used to 
further improve the neutrino momentum estimate. 
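
The percentages quoted above can be checked by integrating the parametrization of Eq. (9) from 0 to z0 < 1, which gives P(z < z0) = (1 - f) z0^(1+A). The short sketch below does this arithmetic. It assumes that the power-law term is normalized with the coefficient (1 + A) and that the exponent is A = 4.4; both are assumptions of this sketch, chosen because they reproduce the quoted fractions.

```python
def hidden_fraction(f, A, z0):
    """P(z < z0) for dP/dz = f*delta(z-1) + (1-f)*(1+A)*z**A on 0 < z < 1 (z0 < 1)."""
    return (1.0 - f) * z0 ** (A + 1.0)

A = 4.4  # assumed power-law exponent

for label, f in (("all b jets", 0.59), ("b jets with a lepton", 0.10)):
    for loss in (0.10, 0.20):
        frac = hidden_fraction(f, A, 1.0 - loss)
        print(f"{label}: {100 * frac:.0f}% hide > {100 * loss:.0f}% of their momentum")
```

With these inputs the four printed fractions are 23%, 12%, 51%, and 27%, in agreement with the values in the text.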

It is interesting that the distribution in missing neutrino energy fraction when a lepton 
is observed is nearly independent of the energy of that lepton, except for the difference 
in probability that the missing energy is negligible or zero. Additionally, the probability 
distribution for the error in jet momentum measurement is very asymmetric and very far 
from Gaussian. This should be taken into account in the tt̄ final state reconstruction analysis. 
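
One way to see this asymmetry is to draw values of z from the parametrization of Eq. (9). The sketch below uses inverse-CDF sampling and is illustrative only: it assumes the normalization coefficient (1 + A), takes A = 4.4 and the lepton-tagged value f = 0.10, and uses a fixed random seed purely for reproducibility.

```python
import random

def sample_z(f, A, rng):
    """Draw one observable momentum fraction z from Eq. (9) by inverse-CDF sampling."""
    if rng.random() < f:
        return 1.0                       # jet with negligible or zero neutrino momentum
    # The CDF of the normalized power-law part is z**(A+1), so z = u**(1/(A+1)).
    return rng.random() ** (1.0 / (A + 1.0))

rng = random.Random(12345)               # fixed seed, for reproducibility only
zs = sorted(sample_z(0.10, 4.4, rng) for _ in range(100_000))

mean = sum(zs) / len(zs)
median = zs[len(zs) // 2]
```

For these assumed parameters the analytic mean is about 0.86 while the median is about 0.90. The mean is pulled below the median by the long tail toward small z, which is the asymmetry that a Gaussian error model would miss.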



V. DIRECT COMPARISON OF MASS DISTRIBUTIONS 

So far, we have compared jet algorithms by making explicit use of the original parton 
momenta to infer the correspondence between jets and partons. This facilitates a detailed 
comparison of the methods, but it is somewhat artificial, since it can never be carried out 
using real data for which underlying partonic information is unknown. In this section we 
compare the jet algorithms directly, using no information that exists only in the world of 
Monte Carlo. 

An appealing way to make the comparison would be to simulate a full data analysis 
recommended for tt̄ events, and see how the choice of jet algorithm affects the uncertainty 
in measuring m_t. The treatment of measurement errors in that analysis, however, is very 
complicated; and further complications arise from the role played by missing p⊥ in identifying 
the leptonically decaying W, and from the existence of a variety of classes of events with 
regard to b-tagging information (zero, one, or two tags with varying degrees of certainty). 
The complete comparison can therefore only be done properly by the experimentalists who 
are in a position to use full simulations of the detector, and who can make the comparison 
with data as well as with Monte Carlo events. 

In order to test our methods directly but without carrying out the full tt̄ analysis, 
HERWIG events were generated as before except that all parton-level cuts were removed. 
The intermediate model of the calorimeter was used, i.e., energy resolution was included, 
but neutrinos were assumed to be observable. Instead of using partonic information to infer 
the jet assignments, a trijet mass distribution was found by simply plotting a histogram of 
M_jjj formed from each subset of 3 of the 4 highest p⊥ jets. Events with fewer than 4 jets were 
ignored. The minimum jet p⊥ was chosen slightly differently for the different jet algorithms 
to make the fraction of events kept the same for each algorithm. 
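
The construction of the trijet mass histogram entries can be sketched as follows. This is a minimal illustration rather than the code used for Fig. 14: jets are assumed to be given as hypothetical (E, px, py, pz) four-vectors, and `min_pt` is a placeholder for the algorithm-dependent minimum jet p⊥, whose values are not specified here.

```python
import itertools
import math

def invariant_mass(jets):
    """Invariant mass of a set of jets given as (E, px, py, pz) four-vectors."""
    E, px, py, pz = (sum(component) for component in zip(*jets))
    m2 = E * E - px * px - py * py - pz * pz
    return math.sqrt(max(m2, 0.0))       # guard against small negative round-off

def pt(jet):
    """Transverse momentum of a jet."""
    return math.hypot(jet[1], jet[2])

def trijet_masses(jets, min_pt=0.0):
    """Return the 4 values of M_jjj from the 4 highest-pt jets, or [] if fewer than 4 pass."""
    selected = sorted((j for j in jets if pt(j) >= min_pt), key=pt, reverse=True)[:4]
    if len(selected) < 4:
        return []                        # events with fewer than 4 jets are ignored
    return [invariant_mass(trio) for trio in itertools.combinations(selected, 3)]

# Hypothetical event with five massless jets; the softest one is not used.
event = [(50, 50, 0, 0), (40, -40, 0, 0), (30, 0, 30, 0), (20, 0, -20, 0), (5, 0, 5, 0)]
masses = trijet_masses(event)
```

Each event that passes the jet-count requirement contributes four entries to the histogram, at most one of which corresponds to the three jets from a single top quark decay.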

The histograms of M_jjj are shown in Fig. 14. The k⊥ algorithm (solid curve) produces a 
clear peak above the combinatoric background. That background is very large because even 
events that are analyzed correctly contribute three incorrect combinations to the histogram 
in addition to the correct one. The two cone algorithms (dashed and dotted curves) produce 
nearly identical results. They show a peak that is significantly smaller and broader than the 
result of the k⊥ algorithm. 

In a full analysis, b-tagging and the constraint from the hadronic W decay mass would 
greatly reduce the combinatoric background in Fig. 14, and accentuate the difference between 
the methods. The signal peaks would also be slightly narrower because different jet energy 
corrections could be made for the b quark and light quark jets, in place of the cruder method 
of just making an average correction for all jets, which was used in generating Fig. 14. 

VI. CONCLUSIONS 

We have seen that a form of the k⊥ successive recombination jet algorithm offers a significant 
improvement in the fraction of tt̄ events that can be reconstructed and/or offers 
significantly improved top quark mass resolution at the same efficiency, compared with cone 
algorithms like those that have been used up to now for tt̄ data analysis. The basis of this is 
the flexibility of the k⊥ algorithm with respect to jet radius: it can include final particles 
in a cone as large as R = 1 or even greater when possible, while maintaining some usable 
efficiency for resolving jets down to as close as R = 0.2. The improved mass resolution that 
can be obtained using jet angle variables in the top rest frame |I|] has also been confirmed. 
The size of these improvements and the importance of an accurate top quark mass measure- 
ment are such that the procedure should be carried out in spite of the considerable work 
that will be necessary to reevaluate the instrumental corrections using the new methods. 

The particular form of the k⊥ algorithm advocated here is characterized by a simple rule 
for when to terminate the process of combining protojets into jets, as described explicitly 
in Sect. II, where the dependence on the parameters appearing in the algorithm is also 
discussed. With this algorithm, the mass resolution is close to optimal in the sense that the 
majority of the width of the final mass peak is generated by the nominal energy resolution 
of a typical detector, so not much further improvement is theoretically possible. 

We have seen that fluctuations in the momentum carried by neutrinos contribute significantly 
to the error in measuring the momentum of a b jet. This error is reduced in current 
practice by using different distributions according to whether or not a lepton is identified 
in the jet. A matter for future study is to see if any other features of the observed jet can 
be used to further improve the estimate. 

Finally, both the improved jet algorithm and the improved estimate of neutrino contributions 
can also help in the search for other heavy objects that decay into jets, such as Higgs 
boson → bb̄. 

FIG. 2. Distribution of the ratio of observed jet transverse momentum (p⊥^Jet) to original parton 
transverse momentum (p⊥^Parton) in t → bW → bqq̄ for quarks from W decay (solid) and b quarks 
(dotted) at p⊥^Parton ~ 50 GeV/c, for the ideal calorimeter model. 

FIG. 4. Like Fig. 3 except that the calorimeter is blind to neutrinos, which is realistic unless 
the neutrino component can be estimated from leptonic information. 

FIG. 5. Three solid curves for W decay jets and three dotted curves for b jets show the 16th, 
50th, and 84th percentile points (i.e., the middle 68%) for the distributions of p⊥^Jet − p⊥^Parton 
as a function of p⊥^Parton. The calorimeter model is the ideal one as in Fig. 2. 

FIG. 10. Fraction of events for which a good match is found between the 4 highest p⊥ observed 
jets and the 4 primary partons according to a criterion based on agreement in both angle and 
energy, as a function of the minimum separation in (η, φ) between pairs of observed jets. Solid 
curve is for the k⊥ algorithm. Dashed and dotted curves are for the two versions of cone algorithm. 

FIG. 11. Dijet mass distribution from W decays identified by k⊥ algorithm (solid) or cone 
algorithms (dashed, dotted). 

FIG. 12. Distribution of total rms deviation in (η, φ) of best-fitting dijet pair to W decay 
partons using k⊥ algorithm (solid) or cone algorithms (dashed, dotted) as in Fig. 11. 

FIG. 14. Trijet mass distributions formed from each 3 of the 4 highest p⊥ jets observed in each 
event (4 combinations per event), using the k⊥ algorithm (solid) or cone algorithms (dashed, 
dotted). 